Near-optimal multiple-input multiple-output (MIMO) channel detection via sequential Monte Carlo

ABSTRACT

A class of soft-input soft-output demodulation schemes for multiple-input multiple-output (MIMO) channels, based on the sequential Monte Carlo (SMC) framework under both stochastic and deterministic settings. The stochastic SMC sampler generates MIMO symbol samples based on importance sampling and resampling techniques, while the deterministic SMC approach recursively performs exploration and selection steps in a greedy manner. By exploiting the artificial sequential structure of the existing simple Bell Labs Layered Space Time (BLAST) detection method based on nulling and cancellation, the proposed algorithms achieve an error probability performance that is orders of magnitude better than the traditional BLAST detection schemes while maintaining a low computational complexity. Performance is comparable with that of the sphere decoding algorithm, with a much lower complexity. Both the stochastic and deterministic SMC detectors can be employed as the first-stage demodulator in an iterative or turbo receiver in coded MIMO systems.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. provisional patent application No. 60/450,777, filed Feb. 28, 2003, incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of Invention

The invention relates generally to digital data receivers.

2. Description of Related Art

The ever-increasing demand for high-speed wireless data transmission has posed great challenges for wireless system designers to achieve high-throughput wireless communications in radio channels with limited bandwidth. Multiple transmit and receive antennas are most likely to be the dominant solution in future broadband wireless communication systems, as the capacity of such a multiple-input multiple-output (MIMO) channel increases linearly with the minimum between the numbers of transmit and receive antennas in a rich-scattering environment, without increasing the bandwidth or transmitted power [1, 2, 3]. See the Appendix for references. Because of the extremely high spectrum efficiency, MIMO techniques have been incorporated into several standards of various wireless applications, such as the IEEE 802.11 a wireless LAN, the IEEE 802.16 wireless MAN, and the WCDMA standards.

The Bell-labs layered space-time (BLAST) architecture is an example of uncoded MIMO systems now under implementation. In the literature [4, 5, 6], different BLAST detection schemes have been proposed based on nulling and interference cancellation (IC), such as the method of zero-forcing (ZF) nulling and IC with ordering, and the method based on minimum mean-squared error (MMSE) nulling and IC with ordering. The performance of these simple detection strategies is significantly inferior to that of the maximum likelihood (ML) detection, whose complexity grows exponential in terms of the number of transmit antennas. In [7, 8, 9], sphere decoding is proposed as a near-optimal BLAST detection method, whose complexity is cubic in terms of the number of transmit antennas. As hard decision algorithms, the above schemes suffer performance losses when concatenated with outer channel decoder in coded MIMO systems. In [8], a list sphere decoding algorithm is proposed to yield soft-decision output by storing a list of symbol sequence candidates. However, the complexity is significantly increased compared with the original sphere decoding algorithm.

The present invention provides a new family of demodulation algorithms based on sequential Monte Carlo methods. The new algorithms may advantageously be used for soft MIMO demodulation and achieve near-optimal performance with low complexities.

The sequential Monte Carlo (SMC) methodology [10, 11, 12, 13, 14, 15] originally emerged in the field of statistics and engineering and has provided a promising new paradigm for the design of low complexity signal processing algorithms with performance approaching the theoretical optimum for fast and reliable communication in highly severe and dynamic wireless environments. The SMC can be loosely defined as a class of methods for solving online estimation problems in dynamic systems, by recursively generating Monte Carlo samples of the state variables or some other latent variables. In [10, 11, 12, 15], SMC has been successfully applied to a number of problems in wireless communications including channel equalization, joint data detection and channel tracking in fading channels, and adaptive OFDM receiver in time dispersive channels.

BRIEF SUMMARY OF THE INVENTION

The present inventor has recognized that an SMC detector may advantageously be used as the soft-input soft-output demodulator in a turbo receiver. In particular, the present invention encompasses a new class of detection schemes that may advantageously be used for soft-input soft-output MIMO detection for quasi-static MIMO fading channels.

This novel class of receivers is based on the SMC framework and can have either stochastic or deterministic settings. The stochastic SMC demodulator employs the techniques of importance sampling and resampling, where the trial sampling distribution is formed by exploiting the artificial sequential structure of the existing simple nulling and cancellation BLAST detection scheme. The deterministic SMC demodulator recursively performs exploration and selection steps to detect the final survivor signal paths with the corresponding importance weights. Being soft-input and soft-output in nature, these SMC detectors can serve as the soft demodulator in a turbo receiver for coded MIMO systems, where the extrinsic information is iteratively exchanged between the soft demodulator and the soft outer channel decoder to successively improve the receiver performance. Moreover, when the channel is unknown and is estimated by using pilot symbols, the decoded code bits with high reliability can act as pilots for channel re-estimation. By performing iterative channel estimation and data detection, the receiver performance can be further improved. The invention can be used in other applications as well, such as in decoding linear dispersion MIMO systems.

In one aspect of the invention, a method for demodulating data from a channel includes receiving a priori probability values for symbols transmitted across the channel, in accordance with the a priori probability values, determining a set of Monte Carlo samples of the symbols weighted with respect to a probability distribution of the symbols, and estimating a posteriori probability values for the symbols based on the set of Monte Carlo samples.

A related program storage device and receiver apparatus are also provided.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, benefits and advantages of the present invention will become apparent by reference to the following text and figures, with like reference numbers referring to like structures across the views, wherein:

FIG. 1 illustrates a transmitter structure of a coded multiple-input, multiple-output (MIMO) system;

FIG. 2 illustrates a receiver structure of a coded MIMO system;

FIG. 3 illustrates a stochastic SMC MIMO demodulation process;

FIG. 4 illustrates a deterministic SMC MIMO demodulation process;

FIG. 5 illustrates bit error rate (BER) performance of various MIMO demodulation algorithms in an uncoded MIMO system;

FIG. 6 illustrates BER performance of a turbo MIMO receiver employing the stochastic sequential Monte Carlo (SMC) MIMO demodulator, where channel estimation is based on pilots only;

FIG. 7 illustrates BER performance of a turbo MIMO receiver employing the stochastic SMC MIMO demodulator, where iterative channel estimation is based on pilots and the decoded symbols with high reliability;

FIG. 8 illustrates BER performance of a turbo MIMO receiver employing the stochastic SMC MIMO demodulator, where genie-aided channel estimation is based on pilots and all information symbols;

FIG. 9 illustrates BER performance of a turbo MIMO receiver employing the deterministic SMC MIMO demodulator, where channel estimation is based on pilots only;

FIG. 10 illustrates BER performance of a turbo MIMO receiver employing the deterministic SMC MIMO demodulator, where iterative channel estimation is based on pilots and the decoded symbols with high reliability; and

FIG. 11 illustrates BER performance of a turbo MIMO receiver employing the deterministic SMC MIMO demodulator, where genie-aided channel estimation is based on pilots and all information symbols.

DETAILED DESCRIPTION OF THE INVENTION

The description is organized as follows. Section 2 describes the system under consideration. Background materials on the SMC methodology are provided in Section 3. In Section 4, we derive the new class of soft MIMO demodulation algorithms using the stochastic SMC and the deterministic SMC, based on the simple nulling and cancellations BLAST detection scheme. Computer simulation results are provided in Section 5, and conclusions are drawn in Section 6.

2 System Descriptions

In this section, we consider a generic coded MIMO system with a turbo receiver. The transmitter and receiver structures are shown in FIG. 1 and FIG. 2, respectively. In FIG. 2, IT denotes the interleaver and IT¹ denotes the deinterleaver.

2.1 Transmitter Structure

At the transmitter 100, a block of information bits {a₁} are encoded into code bits {b_(i)} at a channel encoder 10. The code bits are then randomly interleaved at an interleaver 120 and mapped to an M-PSK or M-QAM modulation symbol stream using a QPSK modulator 130, taking values from a finite alphabet set A={a₁, a₂, . . . , a_(M)}. Each symbol is serial-to-parallel (S/P) converted at an S/P converter 140 to n_(T) sub-streams via demultiplexing, and each sub-stream is associated with one of a number n_(T) of transmit antennas 150. At each time instance, one symbol from each sub-stream is transmitted from its corresponding antenna, resulting in n_(T) symbols transmitted simultaneously in the same frequency band. Such a space-time bit interleaved coded modulation (BICM) allows for a better exploitation of the spatial, temporal and frequency diversity resources available in the wireless MIMO systems [16].

We consider a MIMO system with n_(T) transmit and n_(R) receive antennas under the BLAST assumption, i.e., assuming that n_(R)≧n_(T) in order to guarantee that the transmitted data symbols can be uniquely decoded at the receiver. The wireless channel is assumed to have rich-scattering and flat fading. The fading between each transmit and receive antenna pair is assumed to be independent. The channel is also assumed quasi-static, i.e., it is static over a data burst and changes from burst to burst.

At the receiver 200, after matched filtering and symbol rate sampling at the receiver's antennas 205, the received signal vector from all n_(R) receive antennas 205 is denoted as y(i)=[y ₁(i) . . . y _(n) _(R) (i)]^(T). In complex baseband representation, the received signal can be expressed as the linear combination of the transmitted signal s(i)=[s₁(i), . . . , s_(n) _(T) (i)]^(T)

$\begin{matrix} {{{y(i)} = {{\sqrt{\frac{\rho}{n_{T}}}{{Hs}(i)}} + {v(i)}}},\mspace{14mu}{i = {1\mspace{14mu}\ldots\mspace{14mu} N}},} & (1) \end{matrix}$ where s(i)εA^(n) ^(T) , p is the total signal energy at the transmitter, HεC^(n) ^(R) ^(×n) ^(T) is the complex fading channel matrix, v(i)˜N_(c)(O, I_(n) _(R) ) is the spatially and temporally white Gaussian noise, and N is the data burst length.

Denote Ω

H^(H)H. The received signal is matched-filtered and whitened, to obtain (here for simplicity, we drop the time index i)

$\begin{matrix} {{u\overset{\bigtriangleup}{=}{{\Omega^{- \frac{1}{2}}H^{H}y} = {{\sqrt{\frac{\rho}{n_{T}}}\Omega^{\frac{1}{2}}s} + w}}},} & (2) \end{matrix}$ where

$w\overset{\bigtriangleup}{=}{{\Omega^{- \frac{1}{2}}H^{H}v} \sim {{N_{c}\left( {0,I_{n_{T}}} \right)}.}}$ Based on (2), the maximum likelihood (ML) MIMO detector is given by:

$\begin{matrix} {\hat{s} = {\arg{\mspace{11mu}\;}{\min\limits_{s \in A^{n_{T}}}{{{u - {\sqrt{\frac{\rho}{n_{T}}}\Omega^{\frac{1}{2}}s}}}^{2}.}}}} & (3) \end{matrix}$ Subsequently, the complexity of the ML receiver grows exponentially in the number of transmit antennas. Alternative sub-optimal approaches such as the method based on zero-forcing nulling and cancellation with ordering, and the method based on MMSE nulling and cancellation with ordering [1, 6], have lower complexities but also incur substantial performance loss. The sphere decoding method has recently emerged as a near-optimal BLAST detection algorithm. However, its complexity is cubic in term of the number of transmit antennas [7, 8].

2.2 Turbo Receiver

Since the combination of M-PSK or M-QAM modulation and symbol-antenna mapping effectively acts as an inner encoder in the MIMO transmitter, the whole system can be perceived as a serial concatenated system and an iterative (turbo) receiver [17, 18] can be designed for such a system as shown in FIG. 2. The turbo receiver 200 includes two stages: the soft-input soft-output SMC demodulator 210 to be developed in Section 4 followed by a soft channel decoder 260. The two stages are separated by a deinterleaver 250 and an interleaver 230.

The soft MIMO demodulator 210 takes as the input the extrinsic information λ₂[b_(i)] delivered by the channel decoder 260 in the previous turbo iteration as well as the received signals u given by (2). First, the symbol a priori probability is calculated at the symbol probability calculator 220 as follows. Assuming QPSK modulation is employed, the bit pair {β_(j,1), β_(j,2)} is mapped to symbol a_(j). Additionally, assume that in the MIMO systems, the transmitted symbol s_(k) corresponds to the interleaved code bit pair {b_(η(k,1)), b_(η(k,2))}, for k=1, . . . , n_(T). The a priori symbol probability for each symbol a_(j)εA is then given by pkj

P(s _(k) =a _(j))=P(b _(η(k,1))=β_(j,1))·P(b _(η(k,2))=β_(j,2)),  (4) where the code bit probability can be computed from the corresponding log likelihood ratio (LLR) as follows [19]:

$\begin{matrix} \begin{matrix} {{P\left( {b_{i} = 1} \right)} = \frac{\exp\left( {\lambda_{2}\left\lbrack b_{i} \right\rbrack} \right)}{1 + {\exp\left( {\lambda_{2}\left\lbrack b_{i} \right\rbrack} \right)}}} \\ {= {{\frac{1}{2}\left\lbrack {1 + {\tanh\left( {\frac{1}{2}{\exp\left( {\lambda_{2}\left\lbrack b_{i} \right\rbrack} \right)}} \right)}} \right\rbrack}.}} \end{matrix} & (5) \end{matrix}$ Note that initially the extrinsic information is set to be zero so that the QPSK symbols are equally probable.

At the output of the soft demodulator 210 are the a posteriori symbol probabilities P(s_(k)=a_(j)|u), k=1, 2, . . . , n_(T), j=1, 2, . . . , M. Denote

$S_{k}^{(j)}\overset{\bigtriangleup}{=}{\left\{ {{{\left( {s_{1},\ldots\mspace{11mu},s_{k - 1},a_{j},s_{k + 1},\ldots\mspace{11mu},s_{n_{T}}} \right)\text{:}\mspace{14mu} s_{1}} \in A},{l \neq k}} \right\}.}$ The exact expression of P(s_(k)=a_(j)|u) is given by

$\begin{matrix} \begin{matrix} {{P\left( {s_{k} = {a_{j}❘u}} \right)} = \frac{{p\left( {{u❘s_{k}} = a_{j}} \right)}{P\left( {s_{k} = a_{j}} \right)}}{p(u)}} \\ {\propto {{P\left( {s_{k} = a_{j}} \right)}{\sum\limits_{s \in S_{k^{\prime}}^{(j)}}{\exp\left( {- {{u - {\sqrt{\frac{\rho}{n_{T}}}\Omega^{\frac{1}{2}}s}}}^{2}} \right)}}}} \\ {\prod\limits_{l \neq k}^{\;}\;{{P\left( s_{l} \right)}.}} \end{matrix} & (6) \end{matrix}$ Since the summations in (6) are over all the A^(n) ^(T) ⁻¹ possible vector s in S_(k) ^((j)), j=1, . . . , M, its complexity is exponential in the number of transmit antennas and impractical for systems with high spatial-multiplexing gain. Here, we develop some low complexity algorithms for approximating (6). Based on the symbol a posteriori probabilities computed by the soft demodulator 210, the bit LLR computer 240 calculates the a posteriori log-likelihood ratios (LLRs) of the interleaved code bits b_(π(i)). Assuming that the code bit b_(π(i)) is included in a QPSK symbol s_(k)εA, the LLR of this code bit is given by

$\begin{matrix} {\begin{matrix} {{\Lambda_{1}\left\lbrack b_{\pi{(i)}} \right\rbrack}\overset{\bigtriangleup}{=}{\log\frac{P\left( {b_{\pi{(i)}} = {1❘u}} \right)}{P\left( {b_{\pi{(i)}} = {0❘u}} \right)}}} \\ {{= {\log\frac{\sum_{{{a_{j} \in {A:s_{k}}} = a_{j}},{b_{\pi{(i)}} = 1}}{P\left( {s_{k} = {a_{j}❘u}} \right)}}{\sum_{{{a_{j} \in {A:s_{k}}} = a_{j}},{b_{\pi{(i)}} = 0}}{P\left( {s_{k} = {a_{j}❘u}} \right)}}}},} \end{matrix}{{k = 1},\ldots\mspace{11mu},{n_{T}.}}} & (7) \end{matrix}$

Using the Bayes' rule, (7) can be written as

$\begin{matrix} {{{\Lambda_{1}\left\lbrack b_{\pi{(i)}} \right\rbrack} = {\underset{\underset{\lambda_{1}{\lbrack b_{\pi{(i)}}\rbrack}}{︸}}{\log\frac{P\left( {{u❘b_{\pi{(i)}}} = 1} \right)}{P\left( {{u❘b_{\pi{(i)}}} = 0} \right)}} + \underset{\underset{\lambda_{2}{\lbrack b_{\pi{(i)}}\rbrack}}{︸}}{\log\frac{P\left( {b_{\pi{(i)}} = 1} \right)}{P\left( {b_{\pi{(i)}} = 0} \right)}}}},} & (8) \end{matrix}$ where the second term in (8), denoted by λ₂[b_(π(i))], represents the a priori LLR of the code bit b_(π(i)), which is computed by the channel decoder 260 in the previous iteration, interleaved at the interleaver 230 and then fed back to the soft MIMO demodulator 210. For the first iteration, it is assumed that all code bits are equally likely. The first term in (8), denoted by λ₁[b_(π(i))], represents the extrinsic information delivered by the soft MIMO demodulator 210, based on the received signals u, the MIMO signal structure, and the a priori information about all other code bits. The extrinsic information λ₁[b_(π(i))]is then deinterleaved at the deinterleaver 250 and fed back to the channel decoder 260, as the a priori information for the channel decoder. The soft decoder 210 takes as input the a priori LLRs of the code bits and delivers as output an update of the LLRs of the coded bits, as well as the LLRs of the information bits based on the code constraints. The soft channel decoding algorithm [20] computes the a posteriori LLR of each code bit,

$\begin{matrix} \begin{matrix} {{\Lambda_{2}\left\lbrack b_{i} \right\rbrack}\overset{\bigtriangleup}{=}{\log\frac{P\left( {{b_{i} = {1❘\left\{ {\lambda_{1}\left( b_{i} \right)} \right\}_{i}}};\mspace{11mu}{{code}\mspace{14mu}{constraints}}} \right)}{P\left( {{b_{i} = {0❘\left\{ {\lambda_{1}\left( b_{i} \right)} \right\}_{i}}};\mspace{11mu}{{code}\mspace{14mu}{constraints}}} \right)}}} \\ {= {{\lambda_{2}\left\lbrack b_{i} \right\rbrack} + {{\lambda_{1}\left\lbrack b_{i} \right\rbrack}.}}} \end{matrix} & (9) \end{matrix}$ where the factorization (9) is shown in [19]. It is seen from (9) that the output of the soft decoder is the sum of the a priori information λ₁[b_(i)], and the extrinsic information λ₂[b_(i)] delivered by the channel decoder 260. This extrinsic information is the information about the code bit b_(i) gleaned from the a priori information about the other code bits, {λ₁[b_(i)]}_(1≠i), based on the constraint structure of the code. Note that at the first iteration, the extrinsic information {λ₁[b_(i)]}_(i) and {λ₂[b_(i)]}_(i) are statistically independent. But subsequently, since they use the same information indirectly, they will become more and more correlated and finally the improvement through the iterations will diminish.

2.3 MIMO Channel Estimation

Since the receiver 200 has no knowledge of channel state information in realistic MIMO systems, pilot symbols embedded in the data stream are required to estimate the channel impulse response. Assume T≧n_(T) time slots are used at the beginning of each data burst to transmit known pilot symbols S=[s₁, s₂, . . . , s_(T)]. Denote the corresponding received signal as Y=≡[y₁, . . . , y_(T)]. Then we have

$\begin{matrix} {{Y = {{\sqrt{\frac{\rho}{n_{T}}}{HS}} + V}},} & (10) \end{matrix}$ where V=[v₁, . . . v_(T)]. In [21], two forms of channel estimators at the channel estimator 270 are given, namely, the maximum likelihood (ML) estimator, given by

$\begin{matrix} {{{\hat{H}}_{ML} = {\sqrt{\frac{\rho}{n_{T}}}Y\;{S^{H}\left( {SS}^{H} \right)}^{- 1}}};} & (11) \end{matrix}$ and the minimum mean-square error (MMSE) channel estimator given by

$\begin{matrix} {{\hat{H}}_{MMSE} = {\sqrt{\frac{\rho}{n_{T}}}Y\;{{S^{H}\left( {{\frac{\rho}{n_{T}}{SS}^{H}} + I_{n_{T}}} \right)}^{- 1}.}}} & (12) \end{matrix}$

It is also shown in [21, 22] that the optimal training sequence in the sense of minimizing the channel estimation error should satisfy the following orthogonality condition SS ^(H) =T·I _(n) _(T) .  (13) One way to produce such a training sequence is to use the complex Walsh codes generated by the Hadamard matrix. As shown in FIG. 2, at the end of each turbo iteration, the a posteriori LLRs of the code bits are fed back from the soft channel decoder 260 through the interleaver 280 to the MIMO channel estimator 270 for channel re-estimation. If the LLRs of all 2n_(T) code bits corresponding to one QPSK MIMO symbol are above a predetermined threshold, the corresponding decoded MIMO symbol is considered as having high-reliability and will act as a training symbol for channel reestimation at the beginning of the next turbo iteration. Intuitively, as the turbo receiver 200 iterates, such a scheme obtains more and more accurate channel estimation since it makes use of more and more training symbols. Accordingly, the overall receiver performance will be improved compared with the scheme where the channel is estimated only once using the pilot symbols before the first turbo iteration.

3 Background on Sequential Monte Carlo

Sequential Monte Carlo (SMC) is a class of efficient methods to obtain random samples from a sequence of probability distributions known only up to a normalizing constant. A general framework for the SMC methods is briefly explained next. Consider the following generic sequence of the a posteriori probability distributions {p(X_(k)|Y_(k))}_(k≧0) where X_(k)=(x₀, x₁, . . . , x_(k)) is the set of unobserved parameters to be estimated and Y_(k)=(y₀, y₁, . . . , y_(k)) is the set of available observations at index k. We emphasize the fact that k is not necessarily a time index. Typically a posteriori distributions are only known up to a normalizing constant as p(Y_(k))=∫p(Y_(k)|X_(k))p(X_(k))dX_(k) is not available in closed form and thus SMC is of great interest in this context. Assume one wants to compute the minimum mean-square error (MMSE) estimate of h(X_(k)), which is given by

$\begin{matrix} {{E\left\{ {h\left( X_{k} \right)} \middle| Y_{k} \right\}} = {\int{{h\left( X_{k} \right)}{p\left( X_{k} \middle| Y_{k} \right)}{{\mathbb{d}X_{k}}.}}}} & (14) \end{matrix}$ Given m random samples

{X_(k)^((j))}_(j = 1)^(m) distributed according to p(X_(k)|Y_(k)), this expectation can be approximated numerically through

$\begin{matrix} {{\hat{E}\left\{ {h\left( X_{k} \right)} \middle| Y_{k} \right\}} = {\frac{1}{m}{\sum\limits_{j = 1}^{m}\;{{h\left( X_{k}^{(j)} \right)}.}}}} & (15) \end{matrix}$ Sampling directly from p(X_(k)|Y_(k)) is often not feasible and it is necessary to derive approximate methods.

3.1 Importance Sampling

Sampling directly from p(Xk|Yk) is often not feasible or too computationally expensive, but drawing samples from some trial density “close” to the distribution of interest is often easy. In this case, we can use the idea of importance sampling. Suppose a set of random samples

{X_(k)^((j))}_(j = 1)^(m) distributed according to q(X_(k)|Y_(k)) is available. We can still come up with valid estimates of (14) by correcting for the discrepancy between p(X_(k)|Y_(k)) and q(X_(k)|Y_(k)). By associating the importance weight

$w_{k}^{(j)} = \frac{p\left( X_{k}^{(j)} \middle| Y_{k} \right)}{q\left( X_{k}^{(j)} \middle| Y_{k} \right)}$ to the sample X^((j)) _(k), we can indeed estimate (14) as

${{\hat{E}\left\{ {h\left( X_{k} \right)} \middle| Y_{k} \right\}} = {\frac{1}{W_{k}}{\sum\limits_{j = 1}^{m}\;{w_{k}^{(j)}{h\left( X_{k}^{(j)} \right)}}}}},$ with

$W_{k} = {\underset{j = 1}{\sum\limits^{m}}{w_{k}^{(j)}.}}$ The pair

{X_(k)^((j)), w_(k)^((j))}_(j = 1)^(m) is called a properly weighted sample with respect to the distribution p(X_(k)|Y_(k)). It is crucial to notice that one only needs to compute the weights up to a normalizing constant so that the knowledge of p(Y_(k)) is not necessary. In the “optimal” case where q(X_(k)|Y_(k))=p(X_(k)|Y_(k)), then all the weights are equal and have zero variance. Roughly speaking, the performance of the method typically deteriorates when the variance of the weights increases.

The importance sampling method is a generic method that is not sequential. However, one may come up with a sequential version of it. Suppose a set of properly weighted samples

{X_(k − 1)^((j)), w_(k − 1)^((j))}_(j = 1)^(m) with respect to p(X_(k−1)|Y_(k−1)) is available at index k−1. Based on these samples, we want to generate a new set of samples

{X_(k)^((j)), w_(k)^((j))}_(j = 1)^(m) properly weighted with respect to p(X_(k)|Y_(k)). The Sequential Importance Sampling (SIS) algorithm proceeds as follows.

-   -   Draw a sample x^((j)) _(k) from a trial distribution

q(x_(k)|X_(k − 1)^((j)), Y_(k)) and let

X_(k)^((j)) = (X_(k − 1)^((j)), x_(k)^((j))).

-   -   Update the importance weight

$\begin{matrix} {w_{k}^{(j)} = {w_{k - 1}^{(j)} \cdot {\frac{p\left( X_{k}^{(j)} \middle| Y_{k} \right)}{{q\left( X_{k - 1}^{(j)} \middle| Y_{k - 1} \right)}{q\left( {\left. x_{k}^{(j)} \middle| X_{k - 1}^{(j)} \right.,Y_{k}} \right)}}.}}} & (16) \end{matrix}$

One can check that the above algorithm indeed generates properly weighted samples with respect to the distribution p(X_(k)|Y_(k)). It is indeed simply an importance sampling method with trial distribution satisfying q(X _(k) |Y _(k))=q(X _(k−1) |Y _(k−1))q(χ_(k) |X _(k−1) ,Y _(k)). This procedure is very general but it is crucial to design a “good” trial distribution to obtain good performance. It is easy to establish that the optimal importance distribution—optimal as it minimizes the conditional variance of the weights var {w_(k)|X_(k−1), Y_(k)} [23]—is given by q(χ_(k)|X_(k−1) ,Y _(k))=p(χ_(k) |X _(k−1) ,Y _(k))  (17) In this case, the importance weight recursion is

w_(k)^((j))  ∝ w_(k − 1)^((j)) ⋅ p(y_(k)|X_(k − 1)^((j)), Y_(k − 1)).

3.2 Resampling

The importance weight w^((j)) _(k) measures the “quality” of the corresponding imputed signal sequence X^((j)) _(k). A relatively small weight implies that the sample is drawn far from the main body of the a posteriori distribution and has a small contribution in the final estimation. Such a sample is said to be ineffective. Whatever being the importance distribution used, the SIS algorithm is getting inefficient as k increases. This is because the discrepancy between q(X_(k)|Y_(k)) and p(X_(k)|Y_(k)) can only increase with k. In practice, after a few steps of the procedure, only one stream dominates all the others; i.e., its importance weight is close to 1 whereas the others are close to 0. To make the SIS procedure efficient in practice, it is necessary to use a resampling procedure as suggested in [24]. Roughly speaking, the aim of resampling is to duplicate the streams with large importance weights while eliminating the streams with small ones; i.e., one focuses the computational efforts in the promising zones of the space. Each sample X^((j)) _(k) is copied K _(j) times with

${\underset{j = 1}{\sum\limits^{m}}\kappa_{j}} = {m.}$ Many resampling procedures have been proposed in the literature [11, 14, 15, 24, 25]. We use here the systematic resampling procedure proposed in [25].

-   -   Compute the cumulative distribution p₀=0,p_(j)=p_(j−1)+w^((j))         _(k)/W_(k) for j=1, . . . , m.     -   Draw a random number U uniformly in [0, 1]. Compute         U_(j)=(U+j)/m with j=0, . . . ,m−1.     -   Set K _(j)={#i ε{0, . . . , m−1};p_(j−1)≦U_(i)<p_(j)}.     -   Assign uniform weights

${\hat{w}}_{k}^{(j)} = \frac{1}{m}$ to the new set of streams

{X̂_(n)^((j))}_(j = 1)^(m).

This algorithm is very computationally efficient. In the class of unbiased resampling schemes, i.e., E{K _(j)}=mw^((j)) _(k−1) this algorithm minimizes the variance var {K _(j)} and displays better performance than other algorithms.

Note that if resampling is used in conjunction with the optimal importance distribution (17), one should perform the resampling step before the sampling step at time k as, in this very specific case, the importance weight at time k is independent of the samples x^((j))k.

Resampling can be done at every fixed-length time interval (say, every five time steps or otherwise periodically) or it can be conducted dynamically. The effective sample size is a criterion that can be used to monitor the variation of the importance weights of the sample streams, and to decide when to resample as the system evolves. The effective sample size is defined as

${{\overset{\_}{m}}_{k}\overset{\Delta}{=}\frac{m}{1 + v_{k}^{2}}},$ where v_(k), the coefficient of variation, is given by

${v_{k}^{2} = {\frac{1}{m}{\sum\limits_{j = 1}^{m}\;\left( {\frac{w_{k}^{(j)}}{{\overset{\_}{w}}_{k}} - 1} \right)^{2}}}},$ with

${\overset{\_}{w}}_{k} = {\frac{1}{m}{\sum\limits_{j = 1}^{m}\;{w_{k}^{(j)}.}}}$ Roughly speaking, the effective sample size is a measure of variation over the weights. In the optimal case, where all the weights are equal, then it is maximum and equal to m. When one weight dominates all the others, then it is very small. In dynamic resampling, a resampling step is performed once the effective sample size m _(k), is below a certain threshold. An alternative criterion such as the entropy of the weights could also be used.

Heuristically, resampling can provide chances for good sample streams to amplify themselves and hence “rejuvenate” the samples to produce better results in the next steps. It can be shown that if m is sufficiently large the resampled streams drawn by the above resampling scheme are also properly weighted with respect to p(X_(k)|Y_(k)). In practice, when small to modest m is used (We use m=64 here), the resampling procedure can be seen as a trade-off between the bias and the variance. That is, the new samples with their weights resulting from the resampling procedure are only approximately proper, which introduces small bias in the Monte Carlo estimates. On the other hand, resampling significantly reduces the Monte Carlo variance for the future samples.

3.3 An Alternative Deterministic Procedure

SMC is a very general set of methods to sample probability distributions on any space one wants. We restrict ourselves to a case of great interest in telecommunications; namely the case where x_(k) can only take values in a finite set say X. In this case the a posteriori distribution p(X_(k)|Y_(k)) could be computed exactly but it is typically far too computationally expensive when k is large as X_(k) can take |X|^(k) possible values where |X| is the number of elements in X.

In this very specific case, the SIS procedure with the optimal importance distribution (17) takes the following form.

-   -   Draw a sample x^((j)) _(k) from p(x_(k)|X^((j)) _(k−1), Y_(k))         and let X^((j)) _(k)=(X^((j)) _(k−1), x^((j)) _(k)).     -   Update the importance weight

w_(k)^((j)) ∝ w_(k − 1)^((j)) ⋅ p(y_(k)|X_(k − 1)^((j)), Y_(k − 1)).

The importance weight is computed using

$\begin{matrix} {{p\left( {\left. y_{k} \middle| X_{k - 1} \right.,Y_{k - 1}} \right)} = {\sum\limits_{x_{k} \in \chi}{{p\left( {\left. y_{k} \middle| X_{k} \right.,Y_{k - 1}} \right)}{{p\left( x_{k} \middle| X_{k - 1} \right)}.}}}} & (18) \end{matrix}$ Two points can be criticized about this algorithm. First, assume that the number of streams m is equal to |X|^(l) with l<n. (If l≧k then one could compute p(X_(k)|Y_(k)) exactly). The algorithm should enumerate the first MI possibilities and compute their a posteriori probabilities; i.e., one should compute p(X_(k)|Y_(k)) exactly. This can be easily incorporated in the initialization phase of an SMC algorithm. Second, one can see that the calculation of the weights (18) involves computing the a posteriori distribution up to a normalizing constant of m|X| streams. Indeed for each X^((j)) _(k−1), one needs to compute p(y_(k)|X^((j)) _(k−1), x_(k), Y_(k−1)) for X_(k)εX. By sampling X^((j)) _(k) this information is somehow discarded. An alternative deterministic approach consists of keeping among the m|X| “candidate” trajectories at time

{X_(k − 1)^((j)), x_(k) ∈ χ}_(j = 1)^(m), the m trajectories with the highest a posteriori distribution. This algorithm can be interpreted as an extended Viterbi algorithm with m survivors. Clearly the local approximation error one commits at the selection step using this deterministic strategy is lower than using the randomized strategy. Nevertheless it does not guarantee performance to be better overall.

To sum up, given

{X_(k − 1)^((j))}_(j = 1)^(m) with a posteriori distributions (known up to a normalizing constant)

{w_(k − 1)^((j))}_(j = 1)^(m), the algorithm proceeds as follows at time k.

-   -   Compute         w _(k) ^((j))(χ_(k))∝w _(k−1) ^((j)) ·p(Yk|X _(k−1) ^((j))         ,χk,Yk−1), χkεX.     -   Select and preserve only the m “best” distinct streams amongst         the mm hypotheses.

The drawback of this approach is that clearly the past observations are as important as the current observation in the calculation of the weight. If an error is committed then it has a significant impact on further decisions. The randomized algorithm is less sensitive to this problem as when the streams are resampled their weights are set to 1/m.

It is worth noticing that both the stochastic and deterministic algorithms are well suited for parallel implementation.

4 Soft MIMO Demodulation Algorithms

In this section, we will derive soft-input soft-output MIMO demodulation algorithms based on the sequential Monte Carlo principle. The importance sampling density is obtained by utilizing the artificial sequential structure of the existing simple nulling and cancellation BLAST detection scheme. First, the channel parameters are assumed to have been estimated perfectly at the receiver through the short training sequence before detection. Later, we will examine the effect of channel estimation error on systems performance with pilot symbols along with high-quality symbol decision feedback.

4.1 Simple Nulling and Cancellation BLAST Detection Algorithm

Consider the signal model (2), we denote the QR-decomposition of Ω^(1/2) as Ω^(1/2) =QR,  (20) where Q is a unitary matrix and R is an upper triangular matrix. The nulling operation is a coordinate rotation that left-multiplies the vector u on (2) by Q^(H) to produce a sufficient statistic,

$\begin{matrix} {{z = {{Q^{H}u} = {{\sqrt{\frac{\rho}{n_{T}}}R\; s} + \overset{\sim}{v}}}},} & (21) \end{matrix}$ where {tilde over (v)}=Q^(H)v. Since Q is unitary, there is no noise enhancement and the noise whitening characteristic maintains by nulling, i.e., {tilde over (v)}˜N_(c)(0, 1_(n) _(T) ). We rewrite (21) as

$\begin{matrix} {\underset{\underset{z}{︸}}{\begin{bmatrix} z_{1} \\ z_{2} \\ \vdots \\ z_{n_{T}} \end{bmatrix}} = {{\sqrt{\frac{\rho}{n_{T}}}\underset{\underset{R}{︸}}{\begin{bmatrix} r_{1,1} & r_{1,2} & \cdots & r_{1,n_{T}} \\ \; & r_{2,2} & \cdots & r_{2,n_{T}} \\ \; & \; & ⋰ & \vdots \\ \; & \; & \; & r_{n_{T},n_{T}} \end{bmatrix}}\underset{\underset{s}{︸}}{\begin{bmatrix} s_{1} \\ s_{2} \\ \vdots \\ s_{n_{T}} \end{bmatrix}}} + \underset{\underset{\overset{\sim}{v}}{︸}}{\begin{bmatrix} {\overset{\sim}{v}}_{1} \\ {\overset{\sim}{v}}_{2} \\ \vdots \\ {\overset{\sim}{v}}_{n_{T}} \end{bmatrix}}}} & (22) \end{matrix}$ We can detect data symbols directly by nulling operation, i.e., multiply z by R⁻¹. However in [5], it is shown that by utilizing the upper triangular structure of R, significant improvement over zero-forcing can be obtained with the following successive interference cancellation method:

$\begin{matrix} \begin{matrix} {\mspace{11mu}{{\hat{s}}_{n_{T}} =}} & {{L\left( {\sqrt{\frac{n_{T}}{\rho}}\frac{z_{n_{T}}}{r_{n_{T},n_{T}}}} \right)},} \\ {{\hat{s}}_{n_{T} - 1} =} & {{L\left( {\frac{1}{r_{{n_{T} - 1},{n_{T} - 1}}}\left( {{\sqrt{\frac{n_{T}}{\rho}}z_{n_{T} - 1}} - {r_{{n_{T} - 1},n_{T}}{\hat{s}}_{n_{T}}}} \right)} \right)},} \\ {\mspace{56mu}\vdots} & \; \\ {\mspace{34mu}{{\hat{s}}_{1} =}} & {{L\left( {\frac{1}{r_{1,1}}\left( {{\sqrt{\frac{n_{T}}{\rho}}z_{1}} - {\sum\limits_{k = 2}^{n_{T}}\;{r_{1,k}{\hat{s}}_{k}}}} \right)} \right)},} \end{matrix} & (23) \end{matrix}$ where L(x)=arg min_(SiεA)(|x−s_(i)|). Although the above simple nulling and cancellation scheme has a very low complexity, its performance is much worse than that of the methods based on zero forcing or MMSE nulling and interference cancellation with ordering as well as that of the sphere decoding algorithm (See FIG. 5). In what follows, we develop SMC-based MIMO demodulation algorithms using the above simple nulling and cancellation method as the kernel. As will be seen in Section 5, these new algorithms provide near-optimal performance in both uncoded and coded MIMO systems.

4.2 Stochastic SMC MIMO Demodulator

As from (22), the artificial sequential structure of the simple nulling and interference cancellation scheme due to the upper-triangular structure is well suited for applying the SMC method to MIMO data detection with the particularity of operating on spatial domain starting from antenna n_(T) to antenna 1. Indeed one has

$\begin{matrix} {{{P\left( s \middle| u \right)} = {{P\left( s \middle| z \right)} \propto {\prod\limits_{k = 1}^{n_{T}}\;{{p\left( z_{k} \middle| {\overset{\sim}{S}}_{k} \right)}{P\left( s_{k} \right)}}}}},} & (24) \end{matrix}$ where {tilde over (Z)}_(k)=(z_(k), . . . ,z_(n) _(T) ) and {tilde over (S)}_(k)=(s_(k), . . . , s_(n) _(T) ). We can thus use SMC methods to simulate from the sequence of probability distributions

$\left\{ {p\left( {\overset{\sim}{S}}_{k} \middle| {\overset{\sim}{Z}}_{k} \right)} \right\}_{{k = n_{T}},{n_{T} - 1},\mspace{11mu}\ldots\mspace{11mu},1}.$ This sequence of “artificial” distributions is defined by

$\begin{matrix} {{P\left( {\overset{\sim}{S}}_{k} \middle| {\overset{\sim}{Z}}_{k} \right)} \propto {\prod\limits_{l = k}^{n_{T}}\;{{p\left( z_{l} \middle| {\overset{\sim}{S}}_{l} \right)}{{P\left( s_{l} \right)}.}}}} & (25) \end{matrix}$

The aim of SMC MIMO detection is to compute an estimate of the a posteriori symbol probability P(s _(k) =a _(i) |z),a _(i) εA, k=1, . . . ,n _(T),  (26) based on the received signal z after nulling. Let

${{\overset{\sim}{S}}_{k}^{(j)}\overset{\Delta}{=}\left( {s_{k}^{(j)},\;\ldots\mspace{11mu},s_{n_{T}}^{(j)}} \right)},{j = 1},\;\ldots\mspace{11mu},m,$ be a sample drawn by the SMC at each symbol interval, where m is the number of samples. In order to implement the SMC, we need to obtain a set of Monte Carlo samples of the transmitted symbols {s^((j)) _(k), w^((j)) _(k)} properly weighted with respect to the distribution of p(s|z). In the application of MIMO demodulation, the function h(·) in (15) is specified by the indicator function l(·) as

$\begin{matrix} {{1\left( {x = a} \right)} = \left\{ \begin{matrix} {1,} & {if} & {{x = a},} \\ {0,} & {if} & {x \neq {a.}} \end{matrix} \right.} & (27) \end{matrix}$ Hence, the a posteriori probability of the information symbol s_(k) can then be estimated as

$\begin{matrix} {{{P\left( {s_{k} = \left. a_{i} \middle| z \right.} \right)} = {{E\left\{ {1\left( {s_{k} = a_{i}} \right)} \middle| z \right\}}\mspace{135mu} \cong {\frac{1}{W_{k}}{\sum\limits_{j = 1}^{m}\;{1\left( {s_{k}^{(j)} = a_{i}} \right)w_{k}^{(j)}}}}}},{a_{i} \in A},} & (28) \end{matrix}$ where

$W_{k}\overset{\Delta}{=}{\underset{j = 1}{\sum\limits^{m}}{w_{k}^{(j)}.}}$

Following (17) in Section 3.1, we choose the trial distribution as:

$\begin{matrix} {{q\left( {\left. s_{k} \middle| {\overset{\sim}{Z}}_{k} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)}} \right)} \propto {{P\left( {\left. s_{k} \middle| {\overset{\sim}{Z}}_{k} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)}} \right)}.}} & (29) \end{matrix}$

For this trial distribution, the importance weight is updated according to

$\begin{matrix} \begin{matrix} {w_{k}^{(j)} = {w_{k + 1}^{(j)} \cdot \frac{P\left( {\overset{\sim}{S}}_{k}^{(j)} \middle| {\overset{\sim}{Z}}_{k} \right)}{{P\left( {\overset{\sim}{S}}_{k + 1}^{(j)} \middle| {\overset{\sim}{Z}}_{k + 1} \right)}{P\left( {\left. {\overset{\sim}{S}}_{k}^{(j)} \middle| {\overset{\sim}{Z}}_{k} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)}} \right)}}}} \\ {= {w_{k + 1}^{(j)} \cdot \frac{P\left( {\overset{\sim}{S}}_{k + 1}^{(j)} \middle| {\overset{\sim}{Z}}_{k} \right)}{P\left( {\overset{\sim}{S}}_{k + 1}^{(j)} \middle| {\overset{\sim}{Z}}_{k + 1} \right)}}} \\ {\propto {w_{k + 1}^{(j)} \cdot {{p\left( {\left. z_{k} \middle| {\overset{\sim}{Z}}_{k + 1} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)}} \right)}.}}} \end{matrix} & (30) \end{matrix}$ We next specify the computation of the predictive distribution in (29) and (30). First, we consider the trial distributions in (29)

$\begin{matrix} {{q\left( {\left. s_{k} \middle| {\overset{\sim}{Z}}_{k} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)}} \right)} \propto {{p\left( {\left. z_{k} \middle| s_{k} \right.,{\overset{\sim}{Z}}_{k + 1},{\overset{\sim}{S}}_{k + 1}^{(j)}} \right)} \cdot {P\left( s_{k} \right)}}} & (31) \end{matrix}$ Since the noise v in (21) is white Gaussian i.e., v˜N_(c)(0,1_(n) _(T) ), we have

$\begin{matrix} {{{p\left( {\left. z_{k} \middle| {\overset{\sim}{Z}}_{k + 1} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)},{s_{k} = a_{i}}} \right)} \sim {N_{c}\left( {\mu_{k,i}^{(j)},1} \right)}},} & (32) \end{matrix}$ where the mean μ^((j)) _(k,i) is given by

$\begin{matrix} {\mu_{k,i}^{(j)} = {\left( {{\sqrt{\frac{\rho}{n_{T}}}{\sum\limits_{l = {k + 1}}^{n_{T}}\; r_{k,{l\; s_{l}^{(j)}}}}} + {\sqrt{\frac{\rho}{n_{T}}}r_{k,k}a_{i}}} \right).}} & (33) \end{matrix}$ We denote

$\begin{matrix} {\alpha_{k,i}^{(j)}\overset{\Delta}{=}{{p\left( {\left. z_{k} \middle| {\overset{\sim}{Z}}_{k + 1} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)},{s_{k} = a_{i}}} \right)} \cdot {P\left( {s_{k} = a_{i}} \right)}}} & (34) \\ {\mspace{34mu}{= {\frac{1}{\pi}\exp{\left\{ {- {{z_{k} - \mu_{k,i}^{(j)}}}^{2}} \right\} \cdot {{P\left( {s_{k} = a_{i}} \right)}.}}}}} & (35) \end{matrix}$ Hence, the predictive distribution in (30) is given by

$\begin{matrix} \begin{matrix} {{p\left( {\left. z_{k} \middle| {\overset{\sim}{Z}}_{k + 1} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)}} \right)} \propto {\sum\limits_{a_{i} \in A}{{p\left( {\left. z_{k} \middle| {\overset{\sim}{Z}}_{k + 1} \right.,{\overset{\sim}{S}}_{k + 1}^{(j)},{s_{k} = a_{i}}} \right)}\mspace{20mu}{P\left( {s_{k} = a_{i}} \right)}}}} \\ {= {\sum\limits_{a_{i} \in A}\alpha_{k,i}^{(j)}}} \end{matrix} & (36) \end{matrix}$

Referring to FIG. 3, we summarize the stochastic SMC MIMO demodulation process, which begins at block 300, as follows:

-   -   0. Initialization: all importance weights are initialized as         w^((j)) ⁻¹=1,j=1, . . . , m (block 305).

At block 310, the following steps are implemented at the k-th recursion (k=n_(T), n_(T)−1, . . . , 1) to update each weighted sample. For j=1, . . . , m:

-   -   1. At block 315, compute the trial sampling density α^((j))         _(k,i) for each a_(i)εA according to (35) with the a priori         symbol probability P(s_(k)=a_(i)) obtained from the last turbo         iteration, calculated using (4).     -   2. At block 320, draw a sample s^((j)) _(k) from the set A with         probability

$\begin{matrix} {{{P\left( {{s_{k} = \left. a_{i} \middle| {\overset{\sim}{S}}_{k + 1}^{(j)} \right.},{\overset{\sim}{Z}}_{k}} \right)} \propto \alpha_{k,i}^{(j)}},{a_{i} \in {A.}}} & (37) \end{matrix}$

-   -   3. At block 325, compute the importance weight

$\begin{matrix} {{w_{k}^{(j)} \propto {w_{k + 1}^{(j)} \cdot {\sum\limits_{a_{i} \in A}\alpha_{k,i}^{(j)}}}},} & (38) \end{matrix}$

-   -   4. At block 330, compute the a posteriori probability of the         information symbol s_(k) according to (28).     -   5. At block 335, the process is repeated for the next sample in         the recursion until the last (j=m) sample is reached. At block         340, resampling may be performed as described in Section 3.2.         Note that the resampling does not necessarily occur in the         sequence indicated in FIG. 3. At block 345, the process is         repeated for the next recursion until the last recursion (k=1)         is reached. At block 350, the process ends.

The stochastic and deterministic demodulation processes of the invention may be implemented using any known hardware, firmware and/or software. For example, the demodulator 210 may include an ASIC that is programmed to perform the desired calculations. A program storage device such as a memory may tangibly embody a program of instructions that are executable by a machine such as a microprocessor to perform the demodulation. The program storage device and microprocessor may be part of, or otherwise associated with, the demodulator 210 and possibly other components in the receiver 200. Moreover, the functionality of any of the components in the receiver 200 may be implemented similarly.

4.3 Deterministic SMC MIMO Demodulator

The deterministic method for estimating the sequence of probability distributions

$\left\{ {P\left( {\overset{\sim}{S}}_{k} \middle| {\overset{\sim}{Z}}_{k} \right)} \right\}_{{k = n_{T}},{n_{T} - 1},\mspace{11mu}\ldots\mspace{11mu},1}$ proceeds as follows. Similarly to the stochastic case, we assume that m samples are drawn at each iteration where m=|A|^(l) with l<n_(T). Note that if l=n_(T), it is equivalent to the maximum likelihood detection and involves |A|^(n) ^(T) calculations. In the exploration step, we compute exactly the distribution

$P\left( {\overset{\sim}{S}}_{n_{T} - l + 1} \middle| {\overset{\sim}{Z}}_{n_{T} - l + 1} \right)$ by enumerating all m particles, e.g., samples, for antenna n_(T) down to antenna n_(T)−l+1. As a result, we have a set of m distinct QPSK sequences

$\left\{ {\overset{\sim}{S}}_{n_{T} - l + 1}^{(j)} \right\}_{j = 1}^{m}$ with weights

{w_(n_(T) − l + 1)^((j))}_(j = 1)^(m) satisfying

$\begin{matrix} \begin{matrix} {w_{n_{T} - l + 1}^{(j)} = {P\left( {\overset{\sim}{S}}_{n_{T} - l + 1}^{(j)} \middle| {\overset{\sim}{Z}}_{n_{T} - l + 1} \right)}} \\ {\propto {\prod\limits_{k = {n_{T} - l + 1}}^{n_{T}}\;{{p\left( {\left. z_{k} \middle| {\overset{\sim}{S}}_{k + 1}^{(j)} \right.,{s_{k}^{(j)} = a_{i}}} \right)} \cdot {P\left( {s_{k}^{(j)} = a_{i}} \right)}}}} \\ {{= {\prod\limits_{k = {n_{T} - l + 1}}^{n_{T}}\beta_{k,i}^{(j)}}},} \end{matrix} & (39) \end{matrix}$ where

$\begin{matrix} {\beta_{k,i}^{(j)}\overset{\Delta}{=}{{p\left( {\left. z_{k} \middle| {\overset{\sim}{S}}_{k + 1}^{(j)} \right.,{s_{k}^{(j)} = a_{i}}} \right)} \cdot {P\left( {s_{k}^{(j)} = a_{i}} \right)}}} & (40) \\ {\mspace{40mu}{{= {\frac{1}{\pi}\exp{\left\{ {- {{z_{k} - \mu_{k,i}^{(j)}}}^{2}} \right\} \cdot {P\left( {s_{k}^{(j)} = a_{i}} \right)}}}},}} & (41) \end{matrix}$ and μ^((j)) _(k,i) is given by (33).

The second step is the extended Viterbi algorithm as described in Section 3.3 performed from antenna n_(T)−l to antenna 1 (i.e., k=n_(T)−l, . . . , 1). We need to update the importance weight according to

$\begin{matrix} {{{{w_{k}^{(j)}\left( a_{i} \right)}\mspace{11mu} \propto \mspace{11mu}{{w_{k + 1}^{(j)} \cdot {p\left( {\left. z_{k} \middle| {\overset{\sim}{S}}_{k + 1}^{(j)} \right.;{s_{k}^{(j)} = a_{i}}} \right)}}{P\left( {s_{k}^{(j)} = a_{i}} \right)}}},{a_{i} \in A},}\mspace{14mu}} & (42) \end{matrix}$ where the m distinct QPSK symbol sequences with highest importance weights are selected as the survivor paths over m|A| hypotheses. The deterministic SMC MIMO demodulation algorithm, which begins at block 400, is summarized in FIG. 4 as follows:

-   -   0. Initialization: for k=n_(T), . . . , n_(T)−l+1, calculate the         exact expression of the probability distribution via (41) by         enumerating all m particles (samples) (block 405).

At block 410, the following steps are implemented at the kth recursion (k=n_(T)−l, n_(T)−l−1, . . . , 1) to update each weight sample. For j=1, . . . , m:

-   -   1. At block 415, compute the importance weight         w _(k,i) ^((j)) ∝w _(k−1) ^((j))·β_(k,i) ^((j)),  (43)         where β^((j)) _(k,i) is given by (41).     -   2. At block 420, select and preserve only the m “best” distinct         streams {tilde over (S)}_(k) ^((j)) with the highest weights         amongst the m|A| hypotheses with weights set {w^((j)) _(k,i)}.     -   3. At block 425, compute the a posteriori probability of the         information symbol S_(k) according to (28). At block 430, the         process is repeated for the next sample in the recursion until         the last (j=m) sample is reached. At block 435, the process is         repeated for the next recursion until the last recursion         (k=n_(T)−l+1) is reached. At block 440, the process ends.

5 Simulation Results

In this section, we provide computer simulation results to illustrate the performance of the proposed turbo receivers in flat-fading MIMO channels with n_(T)=n_(R)=8. The fading coefficients are generated according to h_(4j) ^(iid)N_(e)(0, 1), where i.i.d. denotes “independent, identically distributed”. Channels are assumed to be quasi-static, i.e., they remain constant over the entire frame of N symbols, but vary from frame to frame. Simulation results are obtained by averaging over 500 channel realizations. The number of samples drawn in the stochastic SMC algorithm is m=64. The number of samples drawn in the deterministic algorithm is m=64 (i.e., l=3) in the uncoded case and m=16 (i.e., l=2) in the coded case.

5.1 Performance in Uncoded MIMO Systems

We first illustrate the performance of the proposed stochastic and deterministic SMC MIMO demodulation algorithms in an uncoded MIMO system. The BER performance of these two new algorithms, together with some existing detection algorithms (including the sphere decoding algorithm, the method based on MMSE nulling and cancellation with ordering, the method based on zero-forcing nulling and cancellation with ordering, as well as the simple nulling and cancellation method) are shown in FIG. 5. QPSK modulation is used. Several observations are in order. First of all, the simple nulling and cancellation method displays very poor performance. However, when it is combined with SMC (deterministic or stochastic), we obtain a much more powerful MIMO detector. Secondly, the deterministic detector outperforms the stochastic SMC detector and it actually slightly outperforms the sphere decoding algorithms, which is the best suboptimal MIMO detector known so far!

5.2 Performance in Coded MIMO Systems

For the coded MIMO system, a rate-1/2 constraint length-5 convolutional code (with generators 23 and 35 in octal notation) is employed at the transmitter. The bit interleaver is randomly generated and fixed throughout the simulations. The code bit block size is 512, corresponding to 256 information bits and 32 QPSK MIMO symbols. The channel estimator employed here is the MMSE MIMO channel estimator given by (12), where T=n_(T)=8, and QPSK orthogonal MIMO pilot symbols are used. The number of turbo iterations in each simulation run is four.

In FIGS. 6, 7 and 8, the BER performance of the turbo receiver employing the stochastic SMC demodulator is plotted. “Iter” denotes “iteration”, “ch” denotes “channel” and “est.” denotes “estimated”. The solid curves in these figures correspond to the BER performance with perfectly known channel; whereas the dashed curves correspond to the BER performance with different channel estimation schemes. In FIG. 6, the channel is estimated only once based on the pilots and used throughout all turbo iterations. In FIG. 7, the channel is re-estimated at the beginning of each turbo iteration using both the pilots and the decoded symbols with high-reliability, as discussed in Section 2.3. And in FIG. 8, we assume the channel is estimated by a genie that knows both the pilot symbols and all information symbols. Such a genie aid channel estimate is used throughout all turbo iterations. The scenario for FIG. 8 provides an upper bound on the achievable performance for the turbo receivers with different channel estimation schemes. It is seen that by performing iterative channel estimation and symbol detection, the turbo receiver offers a performance approaching the genie-aided bound.

The corresponding BER performance of a turbo receiver employing the deterministic SMC demodulator is shown in FIGS. 9, 10 and 11. It is seen that the deterministic SMC algorithm offers improved and more stable performance than its stochastic counterpart.

6 Conclusions

The invention provides a new family of demodulation algorithms particularly useful for soft-input soft-output MIMO demodulation. These new techniques illustratively use the conventional BLAST detection based on simple nulling and cancellation as the kernel, and are based on the sequential Monte Carlo (SMC) method for Bayesian inference. Two versions of such SMC MIMO demodulation algorithms are developed, based on respectively stochastic and deterministic sampling. As hard MIMO detection algorithms, the proposed SMC demodulation algorithms significantly outperform all existing BLAST detection methods. Moreover, the deterministic SMC MIMO detector slightly outperforms the sphere decoding algorithm. Furthermore, in coded MIMO systems, the proposed SMC algorithms can naturally serve as the soft MIMO demodulator in a turbo receiver. Simulation results indicate that overall the deterministic SMC MIMO demodulator offers better and more stable performance compared with its stochastic counterpart in both the uncoded and the coded MIMO systems.

The invention has been described herein with reference to particular exemplary embodiments. Certain alterations and modifications may be apparent to those skilled in the art, without departing from the scope of the invention. The exemplary embodiments are meant to be illustrative, not limiting of the scope of the invention, which is defined by the appended claims.

APPENDIX—LIST Of REFERENCES

-   [1] G. J. Foschini and M. J. Gans. On the limits of wireless     communications in a fading environment when using multiple antennas.     Wireless Personal Commun., 6(3):311-335, 1998. -   [2] I. E. Telatar. Capacity of multi-antenna Gaussian channels.     Euro. Trans. Telecommun., 10(6):585-595, November 1999. -   [3] C. N. Chuah, D. N. C. Tse, J. M. Kahn, and R. A. Valenzuela.     Capacity scaling in MIMO wireless systems under correlated fading.     IEEE Trans. Inform. Theory, 48(3):637-650, March 2002. -   [4] P. W. Wolniansky, G. J. Roschini, G. D. Golden, and R. A.     VAlenzuela. V-BLAST: an architecture for realizing very high data     rates over the rich-scattering wireless channel. In Proc. 1998 Int.     Symp. Sig. Sys. Elect. (ISSSE'98), Pisa, Italy, September 1998. -   [5] G. J. Foschini. Layed space-time architecture for wireless     communication in a fading environment when using multi-element     antennas. Bell Labs. Tech. J, 1(2):41-59, 1996. -   [6] G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W.     Wolniansky. Detection algorithm and initial laboratory results using     V-BLAST space-time communication architecture. Elect. Let.,     35:14-16, January 1999. -   [7] O. Damen, A. Chkeif, and J. Belfiore. Lattice code design for     space-time codes. IEEE Commun. Let., 4(5):161-163, May 2000. -   [8] B. M. Hochwald and S. T. Brink. Achieving near-capacity on a     multiple-antenna channel. Submitted to IEEE Trans. Commun., April     2002. -   [9] A. M. Chan and I. Lee. A new reduced sphere decoder for multiple     antenna systems. In Proc. 2002 Int. Commun. Conf (ICC'02), New York,     N.Y., April 2002. -   [10] A. Doucet, N. DeFreitas, and N. J. Gordon (eds.). Sequential     Monte Carlo Methods in Practice. New York: Springer-Verlag, 620 pp.,     2001. -   [11] X. Wang, R. Chen, and J. S. Liu. Monte Carlo Bayesian signal     processing for wireless communications. J. VLSI Sig. Proc.,     30(1-3):89-105, Jan.-Feb.-March 2002. -   [12] R. Chen, X. Wang, and J. S. Liu. Adaptive joint detection and     decoding in flat-fading channels via mixture Kalman filtering. IEEE     Trans. Info. Theory, 46(6):2079-2094, September 2000. -   [13] R. Chen and J. S. Liu. Sequential Monte Carlo methods for     dynamic systems. J Amer. Stat.Assoc., 93:1302-1044, 1998. -   [14] R. Chen and J. S. Liu. Mixture Kalman filters. J Amer. Stat.     Assoc. (B), 62:493-509, 2000. -   [15] Z. Yang and X. Wang. A sequential Monte Carlo blind receiver     for OFDM systems in frequency-selective fading channels. IEEE Trans.     Sig. Proc., 50(2):271-280, February 2002. -   [16] A. M. Tonello. On turbo equalization of interleaved space-time     codes. In Proc. 2001 Fall Vehi. Tech. Conf (VTC-fall'01), October     2001. -   [17] H. Dai and A. F. Molisch. Multiuser detection for     interference-limited MIMO systems. In Proc. 2001 Spring Vehi. Tech.     Conf (VTC-spring'01), May 2002. -   [18] M. Sellathurai and S. Haykin. TURBO-BLAST for wireless     communications: theory and experiments. IEEE Trans. Sig Proc.,     50(10):2538-2546, October 2002. -   [19] X. Wang and H. V. Poor. Iterative (Turbo) soft interference     cancellation and decoding for coded CDMA. IEEE Trans. Commun.,     47(7):1046-1061, July 1999. -   [20] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv. Optimal     decoding of linear codes for minimizing symbol error rate. IEEE     Trans. Info. Theory, 20(3):284287, March 1974. -   [21] T. L. Marzetta. BLAST training: estimation channel     characteristics for high-capacity spacetime wireless. In Proc. 37th     Annual Allerton Conf Commun., Comput. & Control, Sep. 1999. -   [22] Q. Sun, D. C. Cox, A. Lozano, and H. C. Huang. Training-based     channel estimation for continuous flat fading BLAST. In Proc. 2002     Int. Conf Commun. (ICC'02), New York, N.Y., April 2002. -   [23] A. Doucet, S. J. Godsill, and C. Andrieu. On sequential Monte     Carlo sampling methods for Bayesian filtering. Stat. & Comp., 10(3):     197-208, 2000. -   [24] N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach     to nonlinear non-Gaussian Bayesian state estimation. In 1993 IEE     Proc., F. Radar Sonar and Navigations, 140(2), April 1993. -   [25] G. Kitagawa. Monte Carlo filter and smoother for non-Gaussian     nonlinear state space models. J Comput. Graph. Statist., 5(1):1-25,     1996. 

1. A method for demodulating data from a multiple-input multiple-output (MIMO) channel, comprising: receiving a priori probability values for symbols transmitted across the MTNO channel, said a priori probability values are represented by P(s_(k)=a_(i)), where the symbols in a symbol interval are represented by s_(k), and k is an index identifying a transmit antenna; and a_(i) is an ith value in an alphabet set from which the symbols take their values; in accordance with the a priori probability values, determining a set of Monte Carlo samples of the symbols weighted with respect to a probability distribution of the symbols; and estimating a posteriori probability values for the symbols based on the set of Monte Carlo samples.
 2. The method of claim 1, wherein: the Monte Carlo samples comprise stochastic Monte Carlo samples.
 3. The method of claim 1, wherein: the probability distribution of the symbols is represented by p(s|z), where s is a vector of transmitted signal values for different transmit antennas in a symbol interval, and z is a vector of received signals from the different transmit antennas after nulling.
 4. The method of claim 1, wherein determining the set of Monte Carlo samples of the symbols in a symbol interval, represented by {(s_(k) ^((j)), w_(k) ^((j)))}, comprises: determining a trial sampling density for each ith value, a_(i), in an alphabet set A from which the symbols take their values, using the a priori probability value P(s_(k)=a_(i)) from a previous iteration, where the symbols are represented by s_(k), and k is an index identifying a transmit antenna; drawing the jth sample symbol s_(k) ^((j)), from the alphabet set A, where j=1,2, . . . ,m, and m is a number of the Monte Carlo samples determined for the symbol interval; and computing an importance weight w_(k) ^((j)) for s_(k) ^((j)).
 5. The method of claim 4, further comprising: performing resampling to obtain updated importance weights w_(k) ^((j)).
 6. The method of claim 4, further comprising: initializing the importance weights w⁻¹ ^((j))=1.
 7. The method of claim 1, wherein: m is a number of the Monte Carlo samples determined for a symbol interval; the Monte Carlo samples are represented by {(s_(k) ^((j))), w_(k) ^((j)))}, each a posteriori probability value P(s_(k)=a_(i)|z) is obtained from ${{P\left( {s_{k} = \left. a_{i} \middle| z \right.} \right)} = {\frac{1}{W_{k}}{\sum\limits_{j = 1}^{m}\;{1\left( {s_{k}^{(j)} = a_{i}} \right)w_{k}^{(j)}}}}},{a_{i} \in A}$ where z is a vector of received signals from different transmit antennas after nulling; the symbols are represented by S_(k), where k is an index identifying a transmit antenna; importance weights for the symbols S_(k) are represented by w_(k); A is an alphabet set from which the symbols take their values, and a_(u) is an ith value in A; ${W_{k}\overset{\Delta}{=}{\underset{j = 1}{\sum\limits^{m}}w_{k}^{(j)}}};$ and 1 is an indicator function defined by ${1\left( {x = a} \right)} = \left\{ \begin{matrix} {1,{{{if}\mspace{14mu} x} = a},} \\ {0,{{{if}\mspace{14mu} x} \neq {a.}}} \end{matrix} \right.$
 8. The method of claim 1, further comprising: based on the a posteriori probability values, calculating a posteriori log-likelihood ratios of interleaved code bits.
 9. The method of claim 1, wherein: the Monte Carlo samples comprise deterministic Monte Carlo samples.
 10. The method of claim 1, wherein determining the set of Monte Carlo samples of the symbols in a symbol interval, represented by {(s_(k) ^((j)), w_(k) ^((j)))}, comprises: calculating an exact expression for the probability distribution by enumerating m samples for less than all transmit antennas to obtain m data sequences, where m is a number of the Monte Carlo samples determined for the symbol interval; computing the importance weight w_(k) ^((j)) for each symbol s_(k) ^((j)), where k is an index identifying a transmit antenna; and selecting and preserving m distinct data sequences with the highest weights.
 11. A program storage device tangibly embodying a program of instructions executable by a machine to perform a method for demodulating data from a multiple-input multiple-output (MIMO) channel, the method comprising: receiving a priori probability values for symbols transmitted across the MIMO channel, said a priori probability values being represented by P(s_(k)=a_(i)), where the symbols in a symbol interval are represented by s_(k), and k is an index identifying a transmit antenna; and a_(i) is an ith value in an alphabet set from which the symbols take their values; in accordance with the a priori probability values, determining a set of Monte Carlo samples of the symbols weighted with respect to a probability distribution of the symbols; and estimating a posteriori probability values for the symbols based on the set of Monte Carlo samples.
 12. A demodulator for demodulating data from a multiple-input multiple-output (MIMO) channel, comprising: means for receiving a priori probability values for symbols transmitted across the MIMO channel, said a PRIORI probability values being represented by P(s_(k)=a_(i)), where the symbols in a symbol interval are represented by s_(k), and k is an index identifying a transmit antenna; and a_(i) is an ith value in an alphabet set from which the symbols take their values; means for determining, in accordance with the a priori probability values, a set of Monte Carlo samples of the symbols weighted with respect to a probability distribution of the symbols; and means for estimating a posteriori probability values for the symbols based on the set of Monte Carlo samples.
 13. The demodulator of claim 12, wherein: the Monte Carlo samples comprise stochastic Monte Carlo samples.
 14. The demodulator of claim 12, wherein: the Monte Carlo samples comprise deterministic Monte Carlo samples.
 15. A method for demodulating data from a channel, the channel comprising a multiple-input multiple-output (MIMO) channel, the method comprising: (a) receiving a priori probability values for symbols transmitted across the channel; (b) in accordance with the a priori probability values, determining a set of deterministic Monte Carlo samples of the symbols in a symbol interval, represented by {s_(k) ^((j), w) _(k) ^((j)))}, weighted with respect to a probability distribution of the symbols, by: (b1) calculating an exact expression for the probability distribution by enumerating m samples for less than all transmit antennas to obtain m data sequences, where m is a number of the deterministic Monte Carlo samples determined for the symbol interval; (b2) computing the importance weight w_(k) ^((j)) for each symbol s_(k) ^((j)), where k is an index identifying a transmit antenna; and (b3) selecting and preserving m distinct data sequences with the highest weights; and (c) estimating a posteriori probability values for the symbols based on the set of deterministic Monte Carlo samples; wherein: (d) the probability distribution of the symbols is represented by p(s|z), where s is a vector of transmitted signal values for different transmit antennas in a symbol interval, and z is a vector of received signals from the different transmit antennas after nulling.
 16. A method for demodulating data from a channel, the channel comprising a multiple-input multiple-output (MIMO) channel, the method comprising: (a) receiving a priori probability values for symbols transmitted across the channel; (b) in accordance with the a priori probability values, determining a set of deterministic Monte Carlo samples of the symbols in a symbol interval, represented by {(s_(k) ^((j)), w_(k) ^((j)))}, weighted with respect to a probability distribution of the symbols, by: (b1) calculating an exact expression for the probability distribution by enumerating m samples for less than all transmit antennas to obtain m data sequences, where m is a number of the deterministic Monte Carlo samples determined for the symbol interval; (b2) computing the importance weight w_(k) ^((j)) for each symbol s_(k) ^((j)), where k is an index identifying a transmit antenna; and (b3) selecting and preserving m distinct data sequences with the highest weights; (c) estimating a posteriori probability values for the symbols based on the set of deterministic Monte Carlo samples; wherein: (d) wherein the probability distribution of the symbols is represented by p(s|z), where s is a vector of transmitted signal values for different transmit antennas in a symbol interval, and z is a vector of received signals from the different transmit antennas after nulling; (e) wherein m is a number of the deterministic Monte Carlo samples determined for a symbol interval; each a posteriori probability value P(s_(k)=a_(i)|z) is obtained from ${{P\left( {s_{k} = \left. a_{i} \middle| z \right.} \right)} = {\frac{1}{W_{k}}{\underset{j = 1}{\sum\limits^{m}}{1\left( {s_{k}^{(j)} = a_{i}} \right)w_{k}^{(j)}}}}},{a_{i} \in A}$ where z is a vector of received signals from different transmit antennas after nulling; A is an alphabet set from which the symbols take their values, and a_(i) is an ith value in A; ${W_{k}\overset{\Delta}{=}{\underset{j = 1}{\sum\limits^{m}}w_{k}^{(j)}}};$ and 1 is an indicator function defined by ${1\left( {x = a} \right)} = \left\{ \begin{matrix} {1,{{{if}\mspace{14mu} x} = a},} \\ {0,{{{if}\mspace{14mu} x} \neq a},} \end{matrix} \right.$ and (f) calculating, based on the a posteriori probability values, aposteriori log-likelihood ratios of interleaved code bits. 